Bagging-based System Combination for Domain Adaptation
نویسندگان
چکیده
Domain adaptation plays an important role in multi-domain SMT. Conventional approaches usually resort to statistical classifiers, but they require annotated monolingual data in different domains, which may not be available in some cases. We instead propose a simple but effective bagging-based approach without using any annotated data. Large-scale experiments show that our new method improves translation quality significantly over a hierarchical phrase-based baseline by 0.82 BLEU points and it's even higher than some conventional classifier-based methods.
منابع مشابه
Deep Unsupervised Domain Adaptation for Image Classification via Low Rank Representation Learning
Domain adaptation is a powerful technique given a wide amount of labeled data from similar attributes in different domains. In real-world applications, there is a huge number of data but almost more of them are unlabeled. It is effective in image classification where it is expensive and time-consuming to obtain adequate label data. We propose a novel method named DALRRL, which consists of deep ...
متن کاملBiased Bagging for Unsupervised Domain Adaptation
Unsupervised Domain Adaptation (DA) is used to automatize the task of labeling data: an unlabeled dataset (target) is annotated using a labeled dataset (source) from a related domain. We cast domain adaptation as the problem of finding stable labels for target examples. A new definition of label stability is proposed, motivated by a generalization error bound for large margin linear classifiers...
متن کاملBagging and Boosting statistical machine translation systems
a r t i c l e i n f o a b s t r a c t In this article we address the issue of generating diversified translation systems from a single Statistical Machine Translation (SMT) engine for system combination. Unlike traditional approaches, we do not resort to multiple structurally different SMT systems, but instead directly learn a strong SMT system from a single translation engine in a principled w...
متن کاملApplication of ensemble learning techniques to model the atmospheric concentration of SO2
In view of pollution prediction modeling, the study adopts homogenous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of Sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...
متن کاملA Numerical Design Technique for a Relay - Type Feedback Control System
An efficient numerical method for the design and synthesis of compensator for a relay type control system is developed and discussed. Previous works based on the interactive graphic method are reviewed and it is shown that the combination of the trequency and time domain numerical techniques provide a powerful tool in design of a wide class of relay control systems. An example is presented to d...
متن کامل